Robot Audition – Hands-Free Automatic Speech Recognition under Highly-Noisy Environments – Kazuhiro NAKADAI

Author

Hiroshi G. OKUNO

Abstract

This paper addresses robot audition, which gives robots listening capabilities through robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing; these techniques are applicable to hands-free ASR in highly noisy environments. An implementation of the proposed techniques is available as open-source robot audition software called “HARK.” We show the effectiveness of these techniques through applications of HARK to robots.

Similar resources

Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots

Automatic Speech Recognition (ASR), which plays an important role in human-robot interaction, should be noise-robust because robots are expected to work in noisy environments. Audio-Visual (AV) integration is one of the key ideas for improving robustness in such environments. This paper proposes two-layered AV integration for ASR, which applies AV integration to Voice Activity Detection (VAD) and...

Audio-visual speech recognition system for a robot

Automatic Speech Recognition (ASR) for a robot should be robust to noise because a robot works in noisy environments. Audio-Visual (AV) integration is one of the key ideas for improving its robustness in such environments. This paper proposes AV integration for a robot ASR system, applying AV integration to Voice Activity Detection (VAD) and speech decoding. In VAD, we apply AV-integr...

Design and Implementation of Robot Audition System 'HARK' - Open Source Software for Listening to Three Simultaneous Speakers

This paper presents the design and implementation of the HARK robot audition software system, which consists of sound source localization modules, sound source separation modules, and automatic speech recognition modules for separated speech signals, and which works on any robot with any microphone configuration. Since a robot with ears may be deployed to various auditory environments, the robot audition sy...

Hands-free Speech Recognition Robust to distance and Azimuth in Robot Application

In this paper we present two methods for addressing changes in the speaker's radial position and azimuth, respectively, relative to the robot. For the former, room transfer function (RTF) estimation is employed via waveform-level compensation so that the RTF reflects the change in power caused by the change of radial position. In addition, acoustic model-level compensation is also used ...

Leak energy based missing feature mask generation for ICA and GSS and its evaluation with simultaneous speech recognition

This paper addresses automatic speech recognition (ASR) for robots integrated with sound source separation (SSS) by using leak noise based missing feature mask generation. The missing feature theory (MFT) is a promising approach to improve noise-robustness of ASR. An issue in MFT-based ASR is automatic generation of the missing feature mask. To improve robot audition, we applied this theory to ...

Publication date: 2011
